Q-Learning with Double Progressive Widening: Application to Robotics
Authors
Abstract
Discretization of state and action spaces is a critical issue in Q-Learning. We propose adapting the discretization in real time using the progressive widening technique, which has already been used in bandit-based methods. Results converge consistently to the optimum of the problem, without changing the parametrization for each new problem.
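The idea of growing a discretization as experience accumulates can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the widening rule commonly used in bandit-based tree search, where a state visited n times may hold at most ceil(C * n^alpha) sampled actions, and it widens only the action set while keeping fixed-width state bins (the double variant would also widen over states). The names `allowed_actions` and `WideningQLearner`, and all parameter values, are hypothetical.

```python
import math
import random
from collections import defaultdict


def allowed_actions(visits, c=1.0, alpha=0.5):
    """Assumed widening rule: at most ceil(c * visits**alpha) actions per state."""
    return max(1, math.ceil(c * visits ** alpha))


class WideningQLearner:
    """Tabular Q-learning over a 1-D continuous state and action (illustrative only)."""

    def __init__(self, low, high, bin_width, lr=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.low, self.high = low, high        # bounds of the continuous action
        self.bin_width = bin_width             # fixed state bins (a simplification)
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        self.rng = random.Random(seed)
        self.visits = defaultdict(int)         # state key -> visit count
        self.q = defaultdict(dict)             # state key -> {action: Q-value}

    def key(self, state):
        return math.floor(state / self.bin_width)

    def select(self, state):
        s = self.key(state)
        self.visits[s] += 1
        actions = self.q[s]
        # Widen: sample a new action only while the budget for this state allows it.
        if len(actions) < allowed_actions(self.visits[s]):
            a = self.rng.uniform(self.low, self.high)
            actions[a] = 0.0
            return a
        # Otherwise act epsilon-greedily over the actions already sampled.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(actions))
        return max(actions, key=actions.get)

    def update(self, state, action, reward, next_state):
        s, s2 = self.key(state), self.key(next_state)
        best_next = max(self.q[s2].values(), default=0.0)
        q_sa = self.q[s].get(action, 0.0)
        # Standard Q-learning temporal-difference update.
        self.q[s][action] = q_sa + self.lr * (reward + self.gamma * best_next - q_sa)
```

A typical loop would call `select(state)`, apply the returned action to the environment, and then call `update(state, action, reward, next_state)`. Because the action budget grows sublinearly with the visit count, rarely visited states keep a coarse action set while frequently visited states are refined.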
Similar resources
Q Learning based Reinforcement Learning Approach to Bipedal Walking Control
Reinforcement learning has been an active research area in recent years, not only in machine learning but also in control engineering, operations research, and robotics. It is a model-free learning control method that can solve Markov decision problems. Q-learning is an incremental dynamic programming procedure that determines the optimal policy in a step-by-step manner. It is an online procedure for...
Machine Learning for Autonomous Robotic Agents
We present some results of our research in the field of Machine Learning applied to robotics problems. In particular, we have investigated: (i) the application of Learning Classifier Systems to the synthesis of robot controllers; (ii) learning of fuzzy controllers; (iii) learning of purposeful representations of the environment; (iv) and the application of versions of Q-learning to robot trai...
Adding Double Progressive Widening to Upper Confidence Trees to Cope with Uncertainty in Planning Problems
Current state-of-the-art methods in energy policy planning only approximate the problem (Linear Programming on a finite sample of scenarios, Dynamic Programming on an approximation of the problem, etc.). Monte-Carlo Tree Search (MCTS [3]) seems to be a potential candidate to converge to an exact solution of these problems ([2]). But how fast, and how do key parameters (double/simple progressive ...
Continuous Upper Confidence Trees
Upper Confidence Trees are a very efficient tool for solving Markov Decision Processes; originating in difficult games like the game of Go, they are in particular surprisingly efficient in high-dimensional problems. It is known that they can be adapted to continuous domains in some cases (in particular continuous action spaces). We here present an extension of Upper Confidence Trees to continuous st...
-Learning: A Robotics Oriented Reinforcement Learning Algorithm
We present a new reinforcement learning system more suitable for use in robotics than existing ones. Existing reinforcement learning algorithms are not specifically tailored for robotics, and so they do not take advantage of the characteristics of robotic perception or of the expected complexity of the tasks that robots are likely to face. In a robot, the information about the environment come...